The stochastic behavior of a molecular switching circuit with 
feedback 



Supriya Krishnamurthy* 12 , Eric Smith* 3 , David Krakauer 3 and Walter Fontana 4 



Swedish Institute of Computer Science, Box 1263, SE-164 29 Kista, Sweden 

2 Department of Information Technology and Communication, KTH — Royal Institute of Technology, SE-164 40 Kista, Sweden 
3 Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA 

4 Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston MA 02115 

Email: Supriya Krishnamurthy*- supriya@sics.se; Eric Smith*- desmithOsantafe.edu; David Krakauer - krakauer@santafe.edu; Walter 
Fontana - walter@hms.harvard.edu; 

'Corresponding author 



Abstract 



Background: Using a statistical physics approach, we study the stochastic switching behavior of a model circuit 
of multisite phosphorylation and dephosphorylation with feedback. The circuit consists of a kinase and 
phosphatase acting on multiple sites of a substrate that, contingent on its modification state, catalyzes its own 
phosphorylation and, in a symmetric scenario, dephosphorylation. The symmetric case is viewed as a cartoon of 
conflicting feedback that could result from antagonistic pathways impinging on the state of a shared component. 

Results: Multisite phosphorylation is sufficient for bistable behavior under feedback even when catalysis is linear 
in substrate concentration, which is the case we consider. We compute the phase diagram, fluctuation spectrum 
and large-deviation properties related to switch memory within a statistical mechanics framework. Bistability 
occurs as either a first-order or second-order non-equilibrium phase transition, depending on the network 
symmetries and the ratio of phosphatase to kinase numbers. In the second-order case, the circuit never leaves 
the bistable regime upon increasing the number of substrate molecules at constant kinase to phosphatase ratio. 

Conclusions: The number of substrate molecules is a key parameter controlling both the onset of the bistable 
regime, fluctuation intensity, and the residence time in a switched state. The relevance of the concept of 
memory depends on the degree of switch symmetry, as memory presupposes information to be remembered, 
which is highest for equal residence times in the switched states. 

Reviewers: This article was reviewed by Artem Novozhilov (nominated by Eugene Koonin), Sergei Maslov, and 
Ned Wingreen. 
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Background 

Switching plays an important role in cell cycle progression [1,2], cellular differentiation [3-5], and neuronal 
memory [6]. It is therefore a critical property of molecular signal transduction and gene expression circuits. 
Ultras ensitivity [7, 8] underlies one concept of switching defined by a sharp sigmoidal but continuous 
response in the concentration of a molecule over a narrow range of a (stationary) signal. A steep 
ultrasensitive response, however, leads to "chatter" when the signal fluctuates across the setpoint [9]. In 
contrast, bistability [10] is a form of switching made possible when two stable states, Si and S%, co-exist 
over a signal range. As a consequence, bistable systems exhibit two distinct thresholds as the signal is 
varied, one at which a transition occurs from S\ to 5*2 and another at which the system switches back from 
S2 to Si. The separation of thresholds leads to path dependence or hysteresis, and makes a switched state 
more impervious to stochastic fluctuations of the signal around the transition point. The difference 
between bistable and ultrasensitive switching is illustrated in Fig. [2l where the red curves show switching 
through bistability and the green curve shows the ultrasensitive form of switching. 

Stochasticity in biological processes at the molecular scale has attracted recent attention both 
experimentally and theoretically [9,11-15]. In this contribution we present a stochastic treatment of 
bistable switching generated by a reaction network based on a kinase and a phosphatase that 
phosphorylate and dephosphorylate, respectively, a substrate at multiple sites. The bistability arises from 
symmetric or asymmetric autocatalytic feedback whereby the substrate catalyzes its own phosphorylation 
and dephosphorylation (symmetric case), or it catalyzes one but not the other process (asymmetric case); 
see Fig. [T]for a preview. In statistical physics terminology, the bifurcation from monostablc to bistable 
behavior appears as a non-equilibrium phase transition. Our first step is to derive analytical expressions 
for the phase diagram (the regions in parameter space where the system exhibits bistable switching) using 
a mean-field approximation, in which the consequences of correlations among fluctuations are ignored. We 
then improve on this by taking fluctuations more accurately into account. A major objective is to 
understand how network structure and molecule numbers affect the thresholds (critical points) of the 
switch and the longevity of its memory through intrinsic fluctuations. Some of the analytical results 
germane to this objective are obtained with a field-theoretic approach whose technical details arc laid out 
in a forthcoming paper [16]. Here we validate these calculations with numerical simulations and discuss 
their significance. 

There are two possible meanings for the term "switching" in a bistable system. In one usage the 
circuit abruptly changes from monostability to bistability (from one to two possible, long-lived macroscopic 
phosphorylation states), or vice versa, in response to the variation of a parameter. Such a bifurcation in 
the behavioral repertoire of a system provides a ratchet-like "checkpoint" [17]. The other usage of 
switching refers to the spontaneous fluctuation-induced transition from one macroscopic phosphorylation 
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state to the other within the bistable regime. This corresponds to a cell losing memory of a dynamical state 
because of noise intrinsic to cellular processes at low molecule numbers. In a stochastic setting, the number 
of substrate molecules affects both the transition from monostability to bistability and also the residence 
time in a macroscopic phosphorylation state. Clarifying the role played by fluctuations in these two cases of 
switching will lead to a better understanding of how reliable, device-like, macroscopic behavior can emerge 
from low- level stochastic molecular interactions - a theme dating back at least to von Neumann [18]. 

In summary, we consider a highly stylized kinetic mechanism of bistability that permits a fairly 
detailed analytical treatment of its stochastic behavior. We use techniques from non-equilibrium statistical 
mechanics, which may seem unfamiliar to biologists. Yet, these methods, in conjunction with a model 
emphasizing a pattern more than biological literalism, convey useful lessons in stochastic reaction-kinetics 
of biological systems. 

Model and biological context 

Molecular signal transduction involves the covalent modification of proteins, giving rise to chemically 
distinct protein states. A widely occuring modification is phosphorylation, which is the transfer of a 
phosphate group from an ATP molecule to a tyrosine, serine or threonine residue (referred to as 
phosphoacccptors) of a target protein by means of a kinase. The 1663 proteins documented in version 3.0 
of the Phospho.ELM database [19] contain between 1 and 26 phosphorylatable sites per protein. A 
"serine/arginine repetitive matrix 2" protein (SwissProt accession Q9UQ35) exhibits 96 sites. A list of 
proteins with 10 phosphorylatable sites includes, for example, a glutamate-gated ion channel, lamin, 
glycogen synthase, DNA topoisomcrase 2, insulin receptor, and various protein kinases. 

In the system considered here, the target protein is present in a fixed number of copies. Both 
phosphorylation and dephosphorylation events are catalyzed concurrently by kinase and phosphatase 
enzymes whose numbers are also fixed at the outset. In current terminology [11,20], we focus on the 
intrinsic noise of the reaction system - the fluctuations in the microscopic phosphorylation states of proteins 
due to the probabilistic nature of chemical recations at fixed component numbers - not the additional 
(extrinsic) noise caused by fluctuations in the numbers of target, kinase, and phosphatase molecules. 

The fully phosphorylatcd form of a protein often exhibits a distinct catalytic activity, possibly of the 
kinase or phosphatase type. For example, at each level of the mitogen-activated protein kinase (MAPK) 
cascade, a signaling circuit widely duplicated and diversified within eukaryotes [21], the fully 
phosphorylated form of the target protein acquires the ability to act as a kinase for the subsequent level. 
In this paper we will only study a single level or target protein with multiple phosphoaccepting sites. 

The fully phosphorylated form of a protein may directly or indirectly promote its own 
phosphorylation. This is again the case in the MAP kinase cascade, where such feedback reaches across 
levels. Once fully phosphorylated, the protein at the third level of the cascade catalyzes the accumulation 
of the kinase at the first level [4], thereby feeding back on its own activation. A positive feedback with a 
suitable nonlinearity leads to bistability [4], and the cascade architecture provides that nonlinearity. 
Bistability can also occur with linear feedback, as in a single phosphorylation / dephosphorylation loop, 
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Figure 1: Multisitc phosphorylation and feedback structure. 



wherein the phosphorylated substrate catalyzes its own phosphorylation, provided the phosphatase 
reaction is saturable [6]. If, on the other hand, all reaction velocities are linear in the substrate, as is the 
case for large Michaelis constants and small substrate concentrations, the needed nonlinearity can still be 
generated by multisite phosphorylation. This is the case studied here. We mention for completeness that 
multisite phosphorylation with saturation kinetics at each modification step can lead to bistability even in 
the absence of feedback [22,23]. 

We thus consider a protein with J sites that are phosphorylated by a kinase and dephosphorylated 
by a phosphatase, as illustrated in Fig. [1^. The numbers of kinase molecules, /, and phosphatase 
molecules, P, arc fixed. The enzymes are assumed to operate in the linear regime where complex formation 
is not rate limiting. The site modifications occur in a specific order, thus sidestepping combinatorial 
complexity. Furthermore, phosphorylation (dcphosphorylation) of substrate proteins is assumed to follow a 
distributive mechanism, whereby a kinase (phosphatase) enzyme dissociates from its substrate between 
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subsequent modification events [24,25]. These assumptions lead to J + 1 states for the substrate. The 
phosphorylation chain with feedback is shown in the bottom half of Fig. [TJ Panel (c) depicts an 
asymmetric topology in which the fully phosphorylated substrate catalyzes its own phosphorylation, while 
panel (d) shows the symmetric version in which a substrate molecule is bifunctional. acting as both a 
kinase and phosphatase depending on its modification state. Kinase / and phosphatase P are exogenous 
forces on the modification of the substrate, but the feedback is an endogenous force whose strength is 
proportional to the occupancy of the end-states of the chain. This occupancy is subject to intrinsic 
fluctuations and depends on the total number of substrate molecules. 

Although each modification step is kinetically linear in both the number of substrate molecules and 
modification enzymes, the phosphorylation / dephosphorylation chain of Fig. QJi (no feedback) responds in 
a nonlinear fashion to a change in the ratio of kinase to phosphatase numbers. Absent feedback, the chain 
behaves like a random walk with drift and reflecting barriers in the space of substrate phosphoforms. The 
random walker hops to the right (phosphorylation) with probability 1/(1 + P) and to the left 
(desphosphorylation) with probability P/(I + P). As a result of drift there is a geometric multiplier x 
linking the expected occupancies of phosphoforms. In the absence of feedback, this multiplier is simply 
J/P, the ratio of kinase to phosphatase numbers. For all mean-field treatments, drift causes the expected 
occupancy of molecules in adjacent phosphorylation states to satisfy 

{Uj} _ (ttj-i) 

N ~ N [ ' 

Thus the expected fractions of fully phosphorylated, (nj) /N, and unphosphorylated substrate molecules, 
(n ) /N, are related as 

TV N v ' 

The geometric progression is a conventional result of detailed balance. Since by construction 



When the fully posphorylated end-state feeds back on the chain, it alters x, affecting end-state occupancy 
as described in equation ((3]). As the end-state occupancy increases, the efficacy of feedback catalysis in 
moving phosphoforms to the right eventually diminishes as the chain runs out of unphosphorylated 
substrate. The relevant observation here is that even without the individual reactions exhibiting saturation 
kinetics, a multisitc phosphorylation chain with J > 2 produces sufficient nonlincarity for feedback to 
induce bistability. The addition of feedback changes only the functional form of x in the preceding 



J2j=o ( n j) — 1) we obtain 




(3) 



As the number of sites J tends to infinity, we observe 
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equations, as detailed below. 
Remarks on realism 

We have chosen phosphorylation over other types of post-translational modification for the sake of 
concreteness. The feedback topology of the model caricatures a few elements present in biological systems. 
One such element is the competition between antagonistic pathways that may underlie cellular decision 
processes (for example [26]). A multisitc phosphorylation chain of the type considered here could function 
as an evaluation point between competing and antagonistic pathways influenced by different active 
phosphoforms of the chain, provided these pathways feed back to the chain. In a less extreme case, the 
fully phosphorylated form activates another kinase which then interacts with the chain. In these scenarios, 
feedback is mediated by a series of intervening processes, which may well affect the propagation of 
fluctuations. Yet, if delays are not too large, the collapsed scheme of Fig. [TJ: could be a reasonable proxy 
with the added benefit of mathematically tractability. 

A scenario corresponding more literally to our model involves a bifunctional substrate capable of 
both kinase and phosphatase activity, depending on the substrate's modification state. One example is the 
HPr kinase/P-Ser-HPr phosphatase (HprK/P) protein, which operates in the 

phosphoenolpyruvate:carbohydratc phosphotransferase system of gram-positive bacteria. Upon stimulation 
by fructose-l,6-bisphosphatc, HprK/P catalyzes the phosphorylation of HPr at a seryl residue, while 
inorganic phosphate stimulates the opposing activity of dephosphorylating the seryl-phosphorylatcd HPr 
(P-Ser-HPr) [27]. Another example of a bifunctional kinase/phosphatase is the NRII (Nitrogen Regulator 
II) protein. It phosphorylatcs and dcphosphorylates NRI. NRI and NRII constitute a bacterial 
two-component signaling system, in which NRII is the "transmitter" and NRI the "receiver" that controls 
gene expression. NRII autophosphorylates at a histidine residue and transfers that phosphoryl group to 
NRI. The phosphatase activity of NRII is stimulated by the PII signaling protein (which also inhibits the 
kinase activity). Several other transmitters in bacterial two-component systems seem to possess 
bifunctional kinase/phosphatase activity [28]. 

This completes the sketch of the feedback circuit. We next discuss its statistical properties. 

Results and Discussion 

Formal definitions and result overview 

The state diagram of the circuit is shown in Fig. [TJi. Sites are indexed by j £ 0, . . . , J representing 
phosphoforms of the target protein. We describe a population of N target proteins with a vector n = (rij), 
where component rij is the number of proteins in phosphorylation state j and X)/=o 71 j = N ■ The 
distribution of phosphoforms in the protein population changes in time because independent 
phosphorylation and dephosphorylation events cause individual proteins to hop from a state to a 
neighboring one along the state chain. We assume that both kinase and phosphatase actions occur on a 
time scale of 1 unit. Phosphorylations (hops from state j to j + 1) are catalyzed at rate I + fjnj, where / 
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Figure 2: Ultrasensitive and bistable behavior in the symmetric feedback circuit with J = 2. 



is the number of kinase molecules and the second term describes the feedback resulting from the fully 
phosphorylated state j = J. The intensity of this feedback is assumed to be proportional to the 
instantaneous occupation nj. The feedback is therefore a fluctuating variable, whose strength is set 
implicitly by the total number of target molecules N. Similarly, dcphosphorylation transitions (backward 
hops from site j to j — 1) occur at rate P + fono, where P is the number of phosphatase molecules and no 
is the number of target proteins in state 0. fo and fj are measures of the feedback strengths of sites and 
J, respectively. In all that follows we only deal with the 2 cases fo = fj = 1, depicted in Fig.QJl and 
referred to as symmetric topology, and fo = 0, fj = 1, depicted in Fig. [TJ; and referred to as asymmetric 
topology. It is straightforward to generalise the analysis to other cases. 

We proceed with an overview of the salient features of this switch as we vary two kinds of 
asymmetries: parameter asymmetry, I / P, and circuit asymmetry. These features are obtained from the 
analysis detailed in the next section. In a deterministic setting, the transition from a single stable 
steady-state (monostability) to one with two stable steady-states (bistability) is a saddle-node bifurcation, 
while in statistical mechanics it appears as a non-equilibrium phase transition. In both symmetric and 
asymmetric circuits such a phase transition occurs when the ratio of feedback catalysis (the strength of 
which is set by N) to exogenous catalysis (set by I and P) is varied. The number of target molecules N 
therefore constitutes a degree of freedom for switch control. Simulations, carried out with a simple 
procedure described in the Methods section, indicate that (for J ~ 9 or greater) TV ~ 10 is enough in 
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practice to start showing a sharp transition in all cases. We find analytically that bistability occurs for any 
J > 2, despite the enzymes' operating in the linear regime. 

The case I = P = q for the symmetric circuit has a second-order phase transition between 
monostability and bistability as the ratio g = N/q is varied, that is, the transition occurs at a critical point 
at which the order parameter \(n,j — n ) \ /N changes continuously, see Fig. O (and blue curve in Fig. 2]). 
However, the phase transition is different for finite J and for infinite J. We do not know of any other 
statistical non-equilibrium model which shows this behavior. The infinite- J case, although not directly 
relevant as a biological model, remains important as it constitutes the asymptotic behavior to which all the 
finite- J solutions converge deep within the bistable regime, where the dynamics of the circuit involves 
nearly exclusively one boundary (all phosphorylated or all unphosphorylatcd) , so that the configuration 
space is effectively semi-infinite. 

Parameter asymmetry, I ^ P, or circuit asymmetry also generate transitions from monostability to 
bistabilty; but the transition is now first-order (the order parameter changes discontinuously), see Fig. [5Jd 
and Fig. [U 

For the symmetric circuit and I = P, the monostable phase is one in which all the N target 
molecules are distributed homogenously over the phosphorylation states, while the two degenerate bistable 
states are those where the population is concentrated on either end of the chain. The character of the 
monostable state in cases with any type of asymmetry is different from the symmetric case, in that the N 
particles follow a density profile peaked around j = or j = J (depending on the parameters). In Fig. [3] we 
show the phase diagram for symmetric and asymmetric circuits as a function of parameter asymmetry. 

We can solve this model not only for the mean behavior, but also for the fluctuations as well as the 
residence time in the bistable regime. The latter is of biological interest since it addresses the persistence of 
switched states (memory) when realized by small numbers of molecules [29] . 
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Theoretical analysis 

The system is entirely specified by the probability Pr(n), where n = (no, ■ ■ ■ , rij) is the vector of 
occupancies introduced above. The master equation describing the hopping process in the space of 
phosphoforms for the symmetric circuit (Fig. [TJi) is: 

d J ~ 1 

— Pr(n) = [( J + n -' - s U+i) K + !) Pr ( n + h - h+i) + n j) % Pr (") 

3=0 

+ (P + n Q - Soj) (n j+1 + 1) Pr(n - lj + - (P + n ) n i+1 Pr(n) . (5) 

Here lj represents the vector of zeros with a 1 in the j th position, and 8j y — if j ^ j and 
Sji ./ = 1 (Kronecker delta). In all that follows, we are interested in the stationary state where the 
left-hand-side of Eq. [5] can be set to 0. 

Eq. ([SI) is difficult to solve exactly for J > 1, however it is amenable to a mean- field approximation. 
The idea behind the mean-field theory is to reduce the system of ./V target molecules interacting through a 
fluctuating feedback, to N non-interacting particles acted on by an averaged feedback. In this 
approximation we replace the modification rates P + no and / + nj in ([5]) by effective rates P + (no) and 
/ + (nj), respectively, where the ( ) denotes expectation in the stationary state. This is identical to the 
spirit of mean-field theory for Ising spin systems where the fluctuating field felt by a spin (on account of its 
neighbours) is replaced by an average field, which is then determined self-consistcntly. The reader will find 
a brief summary of the mean-field approach in the Methods section. 

It is helpful to define a generating function F(z) = F(zo, Zi,Z2 • • • , Zj) = P r (^) Zq z™ 1 ....Zj J , 

where the sum is over all configurations {n} that preserve the total number of target molecules. In the 
Methods section we solve Eq. ([S]) for Pr(n) in the mean-field limit, and obtain the generating function for 
N substrate molecules as: 

F{ Z )= {Z ° +XZl \f^ )N + XJZj)N {1- X ) N (6) 

where we define 

x = (e A/2 + 9 (nj) /N) / (e^ 2 + g (n ) /iv) , g = N/VTP, and e x = I /P. (7) 

The expression for x is just a rewrite of x = (I + (nj))/(P + (no)) in terms of the coupling strength 
(control parameter) g. The definition of the kinase to phosphatase ratio as an exponential enables 
convenient use of hyperbolic sines below. Eq. (J6j) obeys the constraints F(l, 1, • • • , I) = 1 (probability 
conservation) and X)/=o dF/dzj = Yli=o n i = N (particle number conservation). 

J z=l J 

From Eq. [6]one obtains (rij) = x (rij_i) = x? (no), and N = (no) (l — x J+1 ) / (1 — x). By defining 
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wc find 



2sinh ( -£) sinh ( -f 
(nj-no) V2/ \2 



N 



[ — ^~ £ 



(8) 



From = a;- 3 (no) and the definition of x, we obtain an expression for g (nj — n ) /N, which, when 
combined with Eq. [SJ enables us to write g in terms of the parameter £: 



I ^2 J ") sinh ("~y~^ 



.9 = 



I | j sinh (— ^£ 



(9) 



In Fig. [J] we invert Eq. [5] to graph £ = logx as a function of g, showing the transitions to bistability for the 
symmetric circuit at different values of A. 




Figure 4: as obtained from inverting Eq. [5] for J = 2. 



In all that follows, we consider the simplest case of I = P (A = 0). Eq. [5] simplifies to 



( — y-f) / sinh \~Y~ S ' 



(10) 
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We can now solve for the order parameter | (nj — uq) \ /N as a function of the parameter g for any J. When 
I = P, we find that for J > 2 there is a critical coupling 



(J+1)/(J-1), 



(11) 



such that for g < g c the uniform distribution £ = with nj/N = 1/ (J + 1) for all j is stable, while for 
g > g c the symmetric distribution is a saddle point and the £ ^ solutions are stable. These results are 
readily extended to the asymmetric circuit, yielding the phase diagrams of Fig.[3l 
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Figure 5: Order parameter \ (nj — no) \ /N for the case I = P in the symmetric circuit. 



In Fig. [5] the theoretical estimate of the order parameter from the mean- field approximation is 
compared against simulations for different values of J. It is seen that the mean-field estimate is very 
accurate for sufficiently large N. The transition at finite J has the character of a Curie- Weiss 
magnetization transition [30], with order parameter scaling as 



inj - n ) 



6J [g__ 1 



1/2 



N J + l V g c 

For any J at g > 1 + 2 (g c — 1), the order parameter saturates to a J-indcpendcnt envelope value 



\(nj - n )| 
N 



1 

1 - -. 
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(12) 



(13) 



Since g c — 1 — » 2/ J for large J, Eq. fTS"]) also gives the behavior in the formal J — > oo limit. The derivative 
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of the order parameter converges to 1 in arbitrarily small neighborhoods of the critical point, rather than 
to oo as in the Curie- Weiss regime; thus J — > oo defines a different universality class than any finite J. 
Qualitatively, the distinction between large and small J is determined by whether one or both reflecting 
boundaries, respectively, are sensed by the near-critical symmetry-broken state. 




Figure 6: Fluctuations in the order parameter. 



So far we have described the mean behavior of the system. Eq. [6] also makes predictions for higher 
moments, in particular the variance of the end-site occupancies a 2 in') — (^(uq — nj) 2 ^ — (uq — nj) 2 . It is 
easy to show that for large g, a 2 (n 1 ) ~ N/g (or the noise a 2 /N 2 ~ l/Ng = VTP/N 2 ). As shown in Fig. [6l 
this is a fairly accurate prediction for any J. At large J, the form (N/g)(l — g~ 2 ' 5 ) represents an excellent 
numerical fit for the variance all the way down to the critical g c . 

For a given distance not too far from the critical point, the noise in the order parameter increases 
with decreasing J. At large g, far inside the bistable region, the variances for all J converge as indicated 
above and shown in Fig. [6l Interestingly, for large J the noise increases to a maximum past the critical 
point (red curve in Fig. [6]), while for small J it decreases. At a phcnomenological level, we may distinguish 
between fluctuations and large-deviations. Roughly speaking, a large-deviation state is one that is very 
unlikely to be reached, but once reached, exhibits a certain duration (residence time). This is the case for 
the spontaneous switching between stationary states in the bistable regime. A fluctuation does not 
necessarily have such distinct time scales. At or near criticality, however, there is no clear-cut separation 
between fluctuations and large-deviations. We return to the topic of large-deviations below. 

Fig. [5] also demonstrates that the variance diverges at the critical point. To understand this, we need 
to go beyond mean-field theory. We can do this by introducing an operator algebra on a basis of number 
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states [31,32], as done for many reaction-diffusion systems in physics, and writing the master equation in 
terms of these number operators. This technique has been recently applied to models of simple gene 
expression circuits [33]. The technical development of our approach based on the formalism of field-theory 
and its application to molecular signal transduction is the main subject of a forthcoming manuscript [16]. 
Here we only sketch the spirit of the approach. 

In principle, the master equation [5] contains the full content of the stochastic process description of 
our system. The discrete master equation, however, is a function of the instantaneous occupancies rij for 
any particular j, and does not lend itself to study of collective changes of phosphorylation state across the 
whole range of j values. One purpose for field-theoretic methods is to re-express the content of the master 
equation in terms of the elementary modes of collective state change. Within this formalism, fluctuations 
in the phosphorylation state of the circuit can be expanded in a set of basis functions (modes) of the 
reaction-diffusion operator associated with the master equation. This treatment identifies the lowest mode 
of the diffusion equation as the collective fluctuation that induces the instability as the coupling g 
approaches the critical value g c . At the critical point, this mode becomes a linear function of j, so that its 
fluctuations describe an aggregate transformation of the substrate population upward or downward in 
phosphorylation state, collectively adding or subtracting a linear term to the phosphorylation state 
distribution. All other modes of diffusion among phosphorylation states decay through the bistability 
transition, almost as they decay in the monostable region. This form of understanding the mechanism 
underlying instability in a stochastic circuit is more natural in a field-theoretic representation than in the 
master equation [5] 

Within the field-theory formalism, a systematic perturbation theory can be developed about the 
mean-field limit, yielding expressions for the variance a 2 (n 1 ) plotted in Fig. [6] (solid lines). This result 
confirms the similarity of the finite- J transition to Curie- Weiss fcrromagnctism, with a divergence at the 
critical point scaling as 



We can also compute the large- N dependence of the average residence time r. This is the average 
time a state in the bistable regime will last before spontaneous fluctuations cause a transition to the other 
state. The residence time is related to the memory (stability) of a stochastic switch. The persistence of a 
switched state has been recently studied in other contexts both numerically [34, 35] and 
theoretically [29,36,37]. We solve the master equation of the system, using the field-theoretic formalism 
described in [16], and obtain the leading large- N dependence of the residence time rasT~ e N f(g> J ) ^ where 
the function f(g, J) is independent of N at fixed g and J. We have numerically evaluated analytical 
expressions for Nf(g, J), obtained from first principles at J = 2, and find good agreement with 
Monte-Carlo simulations, as shown in Fig. [JJ 

These simulations were carried out with a rate constant of 1 for each elementary 
(dc)phosphorylation step. To estimate biological time scales, let us assume a catalytic rate constant of 
phosphorylation, k cat , of 0.01 s _1 [22] (another frequently used value is 1.5 s _1 [8,22]). Dcphosphorylation 
rate constants are roughly an order of magnitude higher [35]. These rate constants refer to the first-order 
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Figure 7: Residence times r in the switched state as a function of J, N, and the distance from the critical 
point g - g c . 



conversion of an enzyme-substrate complex into enzyme and (de)phosphorylated product. Our model, 
however, does not include complex formation between enzyme and substrate, and (de)phosphorylation 
reactions are second-order. Because kinase and phosphatase concentrations remain unchanged at each step, 
we may absorb them into a pseudo first-order rate constant. To estimate that rate constant, we interpret 
our model as operating in the linear Michaelis-Menten regime, where the pseudo first-order rate constant k 
becomes k ca t[I]/ K m , with K m the Michaelis constant of the enzymatic reaction and [I] the concentration 
of kinase molecules. Ballpark figures for K m are 50 to 500 nM [22]. 1 nM corresponds to about 0.6 
particles per femtoliter (fl), or roughly 25 particles per yeast cell volume (42 fl on average). In the 
simulations and calculations of Fig.[7H we have I = P = N/g, to keep the coupling g constant as N 
changes. The range of the abscissa in Fig. [7^ therefore corresponds to kinase concentrations of 0.5 to 2 nM, 
assuming a yeast cell volume. To roughly estimate spontaneous switching times, consider the symmetric 
circuit with J = 2, not too far past the critical point, and N = 50 target particles corresponding to 2 nM, 
/ = P = 12 kinase (phosphatase) particles corresponding to 0.5 nM, and r ~ (l/fc)e 5,2 = 181/fc seconds. 
Using k = k ca t[I}/ K m = 0.01 • 0.5/50 = 10~ 4 s _1 , we obtain an expected residence time of 21 days in a 
yeast cell volume. Mammalian cell sizes range anywhere from twice to several hundred times a yeast cell 
volume, which increases the residence time of a switched state for the same number of particles. The main 
message is that for the type of symmetric circuit considered here, the spontaneous switching times appear 
to be in the order of weeks to years for rather moderate numbers of molecules in the minimal circuit. 
Longer lasting memory may occur in the case of gene regulatory networks [29] , which typically involve 
smaller overall rate constants than post-translational signaling events. 

The spontaneous switching time strongly increases with the number of phosphoacccpting sites J, as 
can be seen in Fig. [7] Fits to the numerical data of Fig. [7] suggest r ~ a exp[N(g — g c ) 2 (f3 + 7 J 3 / 2 )] , where 
(3 < (slightly), 7 > 0, and a > exhibits a weak (N, J)-dependence. It is conceivable that cellular 
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circuits possess switched states spanning a wide range of persistence times and that the evolutionary 
process acts on the longevity of such states by changing J. 

A recent numerical study reports residence times of several decades for a bistable 
(auto)phosphorylation circuit implicated to function in long term potentiation [35]. The circuit involves 
calcium/calmodulin-dependent protein kinase II (CaMKII) with J = 6. Despite an architecture that differs 
considerably from the circuit considered here, the authors also find an exponential iV-dependence, predicted 
in [29] to be a generic form for domain-switching processes of this type by arguments similar to ours. 

Conclusions 

The circuit studied in this contribution represents a situation in which competing feedbacks (enhancing 
phosphorylation and, simultaneously, dcphosphorylation) impinge on a protein with multiple 
phosphorylation sites. The literal reading of the model assumes a bifunctional enzyme, able to switch 
states (between kinase and phosphatase activity) according to its degree of phosphorylation. On another 
interpretation the model represents a contraction of indirect feedbacks that are mediated by antagonistic 
pathways regulated by the target substrate of the circuit. The extent to which this interpretation is 
appropriate depends on the sensitivities and delays of the intervening pathways. The primary value of this 
contribution is to present, on a biologically motivated example, a framework (presented in detail 
elsewhere [16]) for computing the structure, scaling, and magnitudes of fluctuations and residence times for 
a class of stochastic switching circuits, along with analytical results for the symmetric case. 

We show that multisite phosphorylation of a target protein in the linear operating regime of 
enzymatic catalysis generates bistability when combined with positive feedback. The change from 
monostability to bistability upon variation of the pertinent parameters can be characterized in terms of 
statistical mechanics as a non-equilibrium phase transition. The phase structure of the mean values, 
including the transition between monostablc and bistable regions, is well approximated by the classical 
detailed balance equations of chemical kinetics. Among the parameters controlling system behavior is the 
number of target (substrate) molecules, not just the numbers of kinase and phosphatase molecules. This is 
worth emphasizing because the same number strongly affects the persistence times of the switch. Thus, in 
a molecular circuit like the one considered here, the number of target molecules influences both the 
existence of the switch and the likelihood that fluctuations cause spontaneous transitions between its 
stationary states. 

The analytical framework wc constructed for the fully stochastic process correctly predicts the phase 
diagram, fluctuation spectrum, and large deviation properties for a wide range of parameters as obtained 
from simulations. We summarize a few observations that follow from this analysis. 

1. The behavior of multisite phosphorylation is often characterized in terms of the stationary fraction of 
fully phosphorylatcd form only. In this order parameter, multisite phosphorylation without feedback 
(where switching is of the "ultrasensitive" form) appears worse than hyperbolic in switching efficiency 
for any j > 1, and for j ^ oo the response curve becomes a right-shifted hyperbola [38] (see also 
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Eq-BJ, justifying the label "thresholding device". In Fig. [5J however, the order parameter is the 
stationary difference between the occupancies of the fully phosphorylated and unphosphorylated 
forms. In this rendition, the behavior of multisite phosphorylation increases switching efficiency (as 
defined in [38] ) for any j > 1 (though not nearly as dramatically as with feedback) and does not 
exhibit a threshold. These are not contradictory observations, but they illustrate the importance of 
the choice of observable (order parameter). 

2. With the addition of feedback, the stochastic description of the substrate population evidences a 
non-equilibrium phase transition between monostable and bistable regimes (corresponding to the 
saddle- node bifurcation in the deterministic picture). In the symmetric case, as is intuitively obvious, 
the monostable regime has all phosphoforms represented with equal probability, whereas the bistable 
regime has two degenerate distributions peaked at the fully phosphorylated or fully dcphosphorylated 
forms. In the fully symmetric case the phase transition is second-order, as in the Curie model of 
ferromagnetism. Interestingly, the ferromagnetic analogy breaks down in the limit J — > oo. This limit 
is relevant in that the finite- J behavior converges to it for increasing feedback at strengths deep 
inside the bistable regime. Any form of asymmetry considered here, circuit-wise or parameter-wise, 
leads to a first-order phase transition, as in boiling water. 

3. The types of phase transition arc helpful in framing questions about switch memory, that is, the 
system's residence time in the stationary states. Consider the technically imprecise but nonetheless 
informative metaphor of a switch as a double well potential. At the lower critical point a single well 
becomes a double well, the minima representing stationary states. Intrinsic fluctuations can knock 
the system from one well into the other, corresponding to a loss of memory. The concept of memory, 
however, is meaningful only if both wells have approximately similar depth, because that is when 
observing the system in either well conveys information. This situation is realized ideally for the 
symmetric circuit and I = P, which is the second-order phase transition. If, on the other hand, the 
wells are extremely asymmetric, one much deeper than the other, the shallower well is but a rare and 
short-lived excursion from the deep one. There is little information in observing the system in the 
deep well, which is to say that there is (almost) nothing for the system to "remember" . The only 
memory that can be lost (and fast at that) is the occupancy of the shallow well. Our first-principles 
calculations from [16], which agree quite well with simulations (see caption to Fig. [7]), apply strictly 
only to the second-order case at low J and near the critical point, but we do not expect them to 
become meaningless when the transition is weakly first-order. Between the two extremes of 
second-order and strongly first-order there is an intermediate range for which the mean-field 
equations might be used to estimate ratios of spontaneous switching times from one well to the other 
in terms of the inverse ratio of residence times in the respective wells, as required by detailed balance. 

4. The number of substrate molecules plays an important control role in selecting the monostable or 
bistable regimes. We point out that the symmetric circuit has a bistable regime at constant coupling 
g as I/P is varied (see Fig. [2|, but it remains forever bistable as N is increased past g c at constant 
I/P (sec Fig. |3k). Increasing the number of substrate molecules, increases the strength of feedback 
and reaction rates, thereby favoring bistability (and reducing fluctuations). 
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5. The magnitude of fluctuations and residence times in a stationary state as a function of the coupling 
strength g strongly depend on the number of phosphoacccpting sites J. The longevity of state 
memory could thus be tuned by evolving the number of modifiable sites in the target protein of the 
circuit (in addition to changing rate constants). The present state of knowledge does not permit to 
assess whether the latter, more structural degree of freedom is actually exploited in the evolutionary 
tuning of molecular switch circuitry. One would need to identify proteins that are involved in 
switches (i.e. that exhibit bistable phosphoform distributions) and relate the longevity of their 
switched state to the number of phosphorylatablc sites. Likewise, we do not know the extent to 
which cellular decision processes hinge on a wide spectrum of memory time scales in 
post-translational signaling, but such a spectrum seems not to be difficult to achieve. 

Methods 

Monte Carlo simulations 

The Monte Carlo simulations are carried out in a simple manner that follows the process described by the 
master equation [5] A lattice of J + 1 sites (representing states or phosphoforms) is populated with N 
target molecules (particles) randomly distributed over the lattice. The particles are numbered and we keep 
track of their location. To simulate the dynamics, particles are moved to adjacent sites on the lattice at 
discrete and equally spaced time steps. At any one step, each particle has a probability (q + n.j)/(2q + N) 
of moving to the right and (q + n )/(2q + N) of moving to the left, q = I = P. If the particle is located at 
the first (or the last) site, it can only move to the right (or the left) with probability (q + n,j)/ (2q + N) (or 
(q + n )/(2q + N), respectively). The system is evolved until it attains steady-state, when all the 
measurements are made. The normalisation factor 2q + N is an inverse time unit needed to make the rates 
in Eq. [5] into probabilities. 

The definition of mean-field approximations 

The general form of master equations such as Eq. ([5]) may be written 



In Eq. ()15p the discrete argument n has been written as a subscript denoting a vector index rather than a 
functional argument, and the sum over contributing terms is written as a matrix transformation on the 
vector with components Pr ra . The matrix T is called the transfer matrix of the probability process under 
which Pr„ evolves. 

Moments of Pr, such as the mean number, are defined through expectations 




(15) 



n ' 




(16) 



7 i. 
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Through the master equation (|15p the time evolution of (n) may be calculated as 



n n n' 

The steady-state solution d(n) /dt = may be solved through Eq. (fT?]) as a balance equation between 
moments of Pr expressed on the right-hand side of the equality. In general, the moment equations are not 
closed, either for steady or non-steady states. The time dependence or steady-state condition for (n) is a 
function of quadratic moments in n, whose time dependence or steady state conditions are in turn 
functions of higher-order moments. 

The mean-field approximation truncates this hierarchy of dependencies into a closed system of 
equations for the expectations (n). In particular, because the steady-state solution to Eq. ()17[) . for the 
master equation [5] involves only quadratic moments, the relation can be closed by replacing / + nj by 
(/ + nj) and P + no by (P + no) in Eq. [5] and then evaluating the relation (fl7|) at d (n) /dt = 0. The net 
result is to replace terms such as ((/ + nj) nj) by the approximations (/ + nj) (nj). The resulting feedback 
operates only through its mean strength (/ + nj) = I + (nj), which motivates the term "mean field 
approximation" . 

The mean-field approximation of Eq. (J5]) leads to the classical equations of detailed balance with 
hopping rates expressed in terms of mean occupation numbers. Thus, the mean particle numbers in 
adjacent phosphorylation states satisfy the relation 

(n j+1 ) = x (nj) , (18) 



with the geometric factor 

I + (n,j) _ e x ' 2 + g (n,j) /N 



(19) 



P+(n ) e-^+g(n )/N 7 
by the definitions, Eq. (J7]), in the main text. 

The generating function for the steady-state master equation 

To derive the equilibrium solution for the generating function from the master equation |5]). first recall the 
definition F(z) = £ {n} YlLi «?*Pr(n). By multiplying Eq. © by nti Z T and 

summing over {n}, we 

obtain the equivalent time-evolution equation for the generating function 

, .7-1 J J 

j t F ^ - e e [( j + n -> <w) («i + j ) n z ^ Fi( ^ + h ^+0 - ( j + n ^ ^ n ^ pr (^) 

{n} j=0 i=l i=l 

J J 

+ (P + n - Soj) (n j+ i + 1) IJ ^ ipr (" - h + h+i) ~ ( p + no) n j+1 J[ z i " 1 Pr(n)] . (20) 

i=l i=l 
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Because {n} is now a variable of summation, it can be shifted in the first term of each line of Eq. (|2T))) . to 
obtain an equation without offsets in the indices, 



j-i 



{n} 3=0 



(I + nj) I -t-i - 1 ) nj - (P + no) 



-In 



ij+i 



(21) 



— n/=i an d likewise for j + 1, we may further condense Eq. (f2Tj) into 



ShlCe ^ n - lZ 



J-l 



{P + n,)j^—-{I + nj)^- 
OZj+i ozj 



-i J 



(22) 



The mean-field approximation consists in replacing the terms P + uq and I + nj by their average 
values in the configurations - self-consistently determined - which dominate the sum (|22|) . These are 
denoted P + (no) and I + (nj). Note that with this approximation, the terms involving z and d/dz are 
separated from the index sum over Y\i=i z 7 i ^ >x i n )i which may be performed to recover a differential 
equation for the scalar function F (z). Using the definitions from Eq. ([7]) in the main text, and removing an 
overall scale for time to the front, we then recast Eq. (j22|) as 



dt 



F{z) 



N 
fJ 



e -X/2 W\ 

■' N J 



J -i 



3=0 



7 r°—*±)Fi*). 

dz j+ i dzj' 



(23) 



The negative sign reflects the convergence of a finite Markov process to its stationary distribution. 

The polynomials (z^+i — Zj) in Eq. (|23p are not needed to identify a solution to dF/dt = 0, as any 
power of the variable 



CO) =^2x*Zi, 

i=0 



(24) 



is annihilated by the derivative terms in parentheses in Eq. (|23p . independently at each j. The particular 
combination 

Fi{z) = —j = 7-tt ( 25 ) 



Si=0 • 



(l-x J+1 ) 



satisfies the normalization condition F\(z = 1) = 1 for a generating function, but its first-derivative 
satisfies ^2^ =0 d/dzjFi = 1, identifying F\ as the generating function for a single particle. For N 

J 2 = 1 

particles, the generating function F(z) = F^ . which is a general relation for the non-interacting particles 
described by the mean-field approximation, and which recovers Eq. (J6]) in the main text. 
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Reviewers' comments 
Reviewer's report 1 

Ned Wingreen, Department of Molecular Biology, Princeton University 

This paper reports a careful mathematical analysis of a model biochemical circuit with feedback. The 
model consists of a protein with multiple phosphorylation sites, which can feed back to positively affect its 
own phosphorylation or dephosphorylation rates, but only when fully phosphorylated or dephosphorylated, 
respectively. An asymmetric model containing only phosphorylation feedback is also considered. The 
authors show that these simple models display non-equilibrium phase transitions between a monostable 
and a bistable regime, and show that there is good agreement between mean-field theory results and exact 
simulations. Importantly, the rates of phosphorylation and dephosphorylation are taken to be linear 
functions of enzyme density. The bistability emerges instead from the assumption that full phosphorylation 
(or dephosphorylation) of multiple sites is required for feed back. 

Overall, the paper makes the case that even very simple biochemical circuits can display bistability, with 
potentially very slow switching between distinct states. An important point raised in the paper is that the 
concentration of proteins is an essential parameter in determining the regime of behavior. With increasing 
protein concentration, the relative strength of feedback becomes stronger, and the system is driven into the 
regime of bistability. This could be important in biological systems where protein concentrations can be 
actively controlled, e.g. by transcriptional regulation or by regulated protein degradation. 

In the model section, the distinction between "ultrasensitive" and "threshold-like" behavior is not 
immediately obvious. I recommend a figure showing showing examples of these two types of behavior. 

Author response: We have added what is now Figure 2 to illustrate both forms of switching, but also to 
convey a wealth of further information to which we refer at various points in the manuscript. 

From a biological point of view, it would be helpful to relate the switching time estimate to real biological 
parameters. Even a simple example that ends up with a estimated switching time would be helpful. 

Author response: We have added an extensive paragraph discussing and estimating switching times. 



Reviewer's report 2 

Sergei Maslov, Brookhaven National Laboratories 

The manuscript contains an exhaustive analytical and numerical study of the stochastic dynamics of a 
multistcp phosphorylation/dcphosphorylation of a single substrate in the presence of a positive feedback. 
From the mathematical standpoint the results arc as complete as they could ever be: even the 
multi- variable generating function was computed (albeit in the mean-field limit). Expectedly, under 
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favorable conditions the positive feedback generates bistability which is also analyzed by the authors in 
great detail. 

However, in my opinion authors should make a better case for the biological mcaningfulncss of these results. 

Is there any evidence that the (possibly indirect) feedback in real-life systems of this type has a positive 
sign? Could authors name at least one concrete example where bistability of 
phosphorylation/dcphosphorylation of a single substrate is at least suspected? 

Author response: We have added a further example of a protein with bifunctional kinase / phosphatase 
activity. It is not known, however, whether these systems have switch character. We point to reference [26] 
as a recent example suggesting competition between antagonistic pathways in cellular decision processes. 
Again, it is unclear whether such antagonism converges on a shared circuit causing bistability along the 
lines we consider in our paper. 

Since the case of infinite J has no biological meaning I would recommend removing the discussion of its 
peculiar mathematical properties from the manuscript. 

Author response: We disagree. We now explain in the text (as is clear from the pertinent figure) that the 
infinite- J behavior is the asymptote for all finite- J cases deep inside the bistable regime. (The finite- J 
curves converge to the infinite-J curve at large g.) It is, therefore, of interest to understand the infinite-J 
case. 

What are the advantages and disadvantages of different values of J with respect to signal to noise ratio. In 
other words, are there any biological lessons to be learned from Figure 6? 

Author response: We have added a brief paragraph in the pertinent section of the manuscript to verbally 
point to the fact shown in Figure 6 that noise decreases with J at a given distance from the transition. We 
also point out that for a given J noise increases past the transition when J is large and decreases when J is 
small. We don't know whether this has biological significance. 

Since the "average residence time" r is a very important characteristic of any bistable switch used as a 
memory device could authors attempt to estimate it for some biologically realistic microscopic parameters? 
For example, in Figure 7 for, say, TV = 100 molecules would r be of order of minutes, seconds, milliseconds? 

Author response: We have added an extensive paragraph addressing this issue (which was also raised by 
Dr. Wingreen). 

Reviewer's report 3 

Artem Novozhilov, National Center for Biotechnology Information ( nominated by Eugene V Koonin, 
National Center for Biotechnology Information, NIH) 

The authors formulate and analyze a stochastic model of a molecular switch frequently found in the 
molecular control circuitry of cells. This model circuit has been a subject of extensive research during the 
last decade. Within the framework of deterministic models, it was shown that similar models can exhibit 
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bistability and hysteresis, which might be responsible for some of all-or-none irreversible processes in the 
cells. The main novelty introduced by the authors is taking into account intrinsic stochasticity of the 
(de)phosporylation processes and analytical analysis of the resulting stochastic mathematical model (which 
is in remarkable agreement with the simulation results) . Applying formalism from the theory of 
non-equilibrium phase transitions the authors obtain phase diagram (stratification of the parameter space 
into mono- or bistable domains) for arbitrary parameters and fluctuation spectrum for the symmetric 
scheme with equal rates of kinase and phosphatase activity. This is an important contribution because 
stochastic consideration not only allow one to obtain the phase diagram in terms of mean system behavior, 
but also to estimate such probabilistic characteristics as the variance of the order parameter and mean 
residence time of the substrate population in one of the two possible states, thus providing some insight 
how macroscopic behavior can emerge from stochastic molecular interactions. 

However, I have some comments and questions concerning the text. 

I think that some of the statements presented in the Conclusion section do not really follow from the 
results and discussion in the main text. The authors claim that "The primary value of this contribution is 
to present a thorough and rigorous stochastic analysis of a switching circuit" is disputable. 

(a) : The derivation of the mathematical results in the case of the mean-field analysis is, in my opinion, 
quite fragmentary and could be supplemented with a Mathematical Appendix. 

(b) : The results for fluctuation in the order parameter and for residence time are presented without any 
derivation with [16] to unpublished work. These results also could be included in Mathematical Appendix. 

(c) : The results for fluctuation in the order parameter and for residence time are presented and discussed 
only in the case of the symmetric feedback and P = I (loosely speaking, not the most probable case). Is it 
possible to obtain analytical estimates of these characteristics in the general case? 

(d) : Thus, I presume that it would be more appropriate to state that the primary value of the author's 
contribution is to illustrate, on a biologically relevant example, the analytical framework presented 
elsewhere ( [16]) and show that in some particular cases it is possible to calculate fluctuations and 
residence times. 

(e) : I, hence, would also question conclusion 6 (here, the way it is stated in the main text, equal rates 
of(de)phosphorylation are not enough for a second-order phase transition, one should also have the 
symmetric feedback) and . . . 

(f) : ... conclusion 7, which, being quite a general assertion on the average residence times, does not follow 
from the results presented only for the symmetric scheme and P = I. 

Author response: (a): We have added two detailed appendices in regard. One appendix explains the idea of 
mean-field theory. The other derives in detail the generating function F(z). 

(b): We understand the frustration of references to unpublished work, but we feel we cannot do justice to 
the technical details of the field-theory formalism by relegating them into an appendix. The appendix would 
become a manuscript in its own right. A preprint of the paper will be available shortly and we hope we can 
link to it from the web-published version of the present article. We believe that the presently revised and 
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extended version of this manuscript decreases its dependency on the forthcoming paper to an acceptable 
level. 

(c) : We are at the moment unable to provide analytical calculations for residence times in the asymmetric 
case. In point 3 of the Conclusions we now discuss issues related to residence times in the asymmetric case 
more extensively. 

(d) : We agree with the reviewer's assessment regarding the primary value of our contribution to the point 
of paraphrasing his comment in our Conclusions. 

(e) : The second-order phase transition hinges on parameter and circuit symmetry. We are now more 
explicit throughout the manuscript about the symmetries we refer to. 

(f) : We have reworded what was conclusion 7. 

The authors could elaborate on the hysteresis (or path dependence) phenomenon (of course, in terms of the 
order parameter) which can be seen from Figure 4b. where two critical couplings arc found. In connection 
with this, it could be helpful to produce figures similar to Fig. 5 but for P ^ I and both symmetric and 
asymmetric feedbacks. This additional plot(s) can also clarify for a biologist the difference between first- 
and second- order phase transitions. I also suggest to redraw the plot in Fig. 5 such that both branches of 
the order parameter are shown when g > gc , thus illustrating the concept of bistability. 

Author response: The newly added Figure 2 addresses some of these requests and illustrates the concept of 
bistability. We have added wording to the caption of Figure 4 (new numbering) to remind the reader that 
the modulus folds both symmetric branches into one. From the viewpoint of hysteresis, asymmetries affect 
the shape details of the curves but dont impact any essential aspects of the message conveyed by Figure 2. 
We have added what is now Figure 4 to show the bistability transitions for the symmetric circuit with 
asymmetric I j P. This illustrates, as the reviewer requested with biologists in mind, the difference between 
first and second-order transitions. The Figure complements both Figure 3a (phase portrait) and Figure 5 
(which is the I = P case compared against simulations and for varying J). We have not added a similar 
figure for the asymmetric circuit, as the phase portrait, Figure 3b, suffices. The addition of Figure 4 has 
necessitated a few additional equations in the theoretical analysis section. 

The part of the paper dedicated to the average residence time is too short considering its importance. It 
would be very interesting to see residence times in other cases (e.g., the symmetric feedback and P =/= I ). 

Author response: We have now addressed some of these issues, as indicated in our responses to similar 
points raised by other reviewers. 

It is indicated in the text (the last paragraph of the Theoretical Analysis section) that "function f(g, J) is 
independent of N" but parameter g, as defined in the text, depends on N . Can you clarify this point? 

Author response: To meaningfully compare residence times for different values of N at fixed J we must also 
fix the value of g, which is done by adjusting v 'IP to compensate for the fact that N affects the coupling, 
g = N/^/IP. The phrase "f(g, J) is independent of N at fixed g and J" expresses exactly this, but we have 
unpacked it in the caption to Figure 7 (new numbering). 
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Figure 7 gives an impression that the mean residence time depends only on N and J, and is independent of 
kinase activity, phosphatase activity, and the type of the feedback. Is it really so? 

Author response: See our comments to the previous point. 
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Figures 

Figure 1 - Multisite phosphorylation and feedback structure. 

(a) This panel depicts the basic phosphorylation chain without feedback in which a target protein with J 
sites is phosphorylatcd by a kinase I and dcphosphorylatcd by a phosphatase P. The ordered succession of 
phosphorylations yields J + l modification states, labelled 0, 1, . . ., J. (b) The fully phosphorylatcd target 
protein relays a signal into pathways that eventually feed back on the phosphorylation chain, (c) 
Simplification of (b) in which the fully phosphorylated target protein acquires kinase activity and directly 
feeds back on the chain. We refer to this network configuration as the asymmetric circuit, (d) Schematic of 
the network with symmetric feedback in which the substrate protein is bifunctional, whereby the fully 
(dc) phosphorylated form catalyzes (de)phosphorylation of its own precursors. 

Figure 2 - Ultrasensitive and bistable behavior in the symmetric feedback circuit with J = 2. 

The ordinate is the order-parameter (nj — no) /N measuring the macroscopic variable of interest. The 
abscissa measures the relative difference between kinase and phosphatase numbers. In this representation, 
all kinase to phosphatase proportions are compressed symmetrically between -1 and 1. As a result, the 
hyperbolic behavior of a single phosphorylation / dephosphorylation loop (J = 1) without feedback is a 
straight line, the dotted diagonal. All other curves are for J = 2, which is the minimal ultrasensitive case 
capable of bistability. The blue curve shows the ultrasensitivity of the chain without feedback. As J — > oo, 
the slope of the sigmoidal approaches 2 at the midpoint I = P (not shown). The blue curve already covers 
nearly 50% of the ultrasensitivity attainable at J — oo. Ultrasensitivity is enhanced by feedback (green 
curve) , when approaching the threshold for bistability from below. The strength of feedback is reported in 
terms of g = N/y/ IP, where TV is the number of substrate molecules (see text for details). In the case of 
the green curve g = 2.85. The slope at the midpoint is always oo at the critical point for the onset of 
bistability in the presence of feedback. Above the threshold (red curve; g = 5) there is an intermediate 
range of (I — P) / (I + P) with two stable solutions. The dots connecting the two branches indicate a 
sudden change in stationary phosphorylation state as one of the branches ceases to exist when 
(7 — P) I (I + P) passes outside the bistable range. This creates hysteresis. 
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Figure 3 - Phase diagrams. 

(a) Symmetric feedback circuit as in Fig. [TJi. (b) Asymmetric feedback circuit (from the fully 
phosphorylated J-state) as in Fig. Q];. The coupling parameter g — N/^ IP, the exogenous catalytic 
strength P/I, and the number of phosphoacceptor sites J are being varied. Symbols are simulation results 
(N = 1000) and solid lines are analytical results from the mean-field theory for J = 2, 3, 5 (solid, dash, 
dot). The discrepancies arise from fluctuations, which the mean-field theory ignores. In the circuit with 
symmetric feedback, there is one critical coupling strength g c for each ratio of phosphatase to kinase 
numbers. (The phase diagram for symmetric feedback is itself symmetric about in log (I / P). Only half of 
the diagram is shown in panel (a); the symmetric half is obtained by reflection about the vertical axis.) As 
g is increased beyond g c , the system exhibits two stable distributions of phosphorylation states peaked at 
the end states of the chain. Below the critical number, the target molecules are mostly unphosphorylatcd 
(P/I > 1) and the system remains in this state as it becomes bistable. Within the bistable regime, the 
system could be prepared in the mostly phosphorylated state, in which it persists as the number of target 
molecules is increased. Yet, as the number of target molecules is decreased, the mostly phosphorylated 
state loses stability abruptly and the system shifts to the mostly unphosphorylated state. This is a 
first-order phase transition in statistical mechanics. In contrast, for the symmetric feedback circuit and 
P/I = 1 (the leftmost point on the abscissa in panel a), the transition is second-order, that is, continuous 
in the phosphorylation state. This is shown in Fig. O below. Panel (a) reveals that, once bistable, the 
symmetric circuit never loses bistability again as N is further increased. In the language of statistical 
mechanics, only a lower critical coupling exists. Notice that for a fixed coupling g and increasing P/I the 
symmetric system loses bistability again, as illustrated in Fig. O Unlike the symmetric case, the 
asymmetric circuit exhibits a window of bistability (lower and upper critical couplings) in N. For a 
suitable P/I, an increase in the number of target molecules N drives the system through a second 
threshold at which bistability disappears. If the system was mostly unphosphorylated, it now undergoes a 
(first-order) phase transition to the mostly phosphorylated state. 

Figure 4 - £(<?) as obtained from inverting Eq.[9]for J = 2. 

At a fixed ratio of kinase to phosphatase numbers, exp A, the abscissa measures the total number of 
substrate molecules N in terms of g. The ordinate, £ = logs, expresses (at fixed A and J) the same 
information as the order parameter (nj — uq) /N by virtue of Eq. [8] The colors indentify different values of 
A, indicated at the top on the right hand side. The negative sign means that there is more phosphatase 
than kinase in the system, and the only stable distribution below the critical point has (nj) < (no), i.e. 
x < 1. The blue curve for A = depicts the second-order phase transition (also rendered in Fig. [51 where it 
is compared against simulations for different J). At g c = 3 the x = 1 solution becomes unstable (dotted 
continuation), branching off continuously into two stable stationary states. Parameter asymmetry (A ^ 0) 
leads to a discontinous change of the order parameter (first-order phase transition), which becomes 
apparent when the system is prepared in the upper branch and the control parameter g is decreased below 
critical. At that point, the system jumps to the stationary state on the lower curve of the corresponding 
color. Again, dotted segments indicate the unstable solution. 
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Figure 5 - Order parameter \(nj — n ) \ /N for the case I = P in the symmetric circuit. 

The modulus | ■ | folds the two symmetric branches (corresponding to the two stationary state distributions) 
into one. The solid curves show the analytical results from the mean-field theory and the symbols represent 
simulations. J + 1 = [5, 10, 100] is shown in [blue, green, red] for the mean-field theory and as [+,o, x] for 
simulations. Target molecule numbers used in the simulations are, respectively, N = [4000,2000,400]. 

Figure 6 - Fluctuations in the order parameter. 

The fits (solid lines) are leading order expansions about the mean-field solution obtained through the 
field-theory formalism (see text). The straight line has slope 1/g. Symbols, J + 1 values, and colors are as 
in Fig. [5l 

Figure 7 - Residence times r in the switched state as a function of J, N, and the distance from the 
critical point g — g c . 

In all cases the circuit has symmetric feedback and / = P. (a): logr vs. N for J = 2, 3, 4, 5 based on 



Monte-Carlo simulations. To meaningfully compare residence times, the coupling g = N/vIP is kept 

constant by varying the external kinase and phosphatase numbers appropriately, I = P = N/g. To 
compare across J, the distance from the critical point, g — g c , is held constant at 1 (recall that 
g c = ( J + l)/( J - 1)). The green curve is for J = 2 and g = 4.0862 (from setting £ = 1 in Eq. [H)|). This is 
the case calculated from first principles in [16]. The best fit to the slope of the green simulation data is 
logr ~ N * 0.023, which is very close to our theoretical prediction logr ~ N * 0.021. (b): logr vs. g — g c 
for different Js at N = 50. Curves from bottom to top are for J = 2,3, 4, 5, 6, 7, 8, 9. 
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